Progress toward scalable tomography of quantum maps using twirling-based methods and information hierarchies

We present in a unified manner the existing methods for scalable partial quantum process tomography. We focus on two main approaches: the one presented in Bendersky et al. [Phys. Rev. Lett. 100, 190403 (2008)], and the ones described, respectively, in Emerson et al. [Science 317, 1893 (2007)] and Lopez et al. [Phys. Rev. A 79, 042328 (2009)], which can be combined together. The methods share an essential feature: They are based on the idea that the tomography of a quantum map can be efficiently performed by studying certain properties of a twirling of such a map. From this perspective, in this paper we present extensions, improvements and comparative analyses of the scalable methods for partial quantum process tomography. We also clarify the significance of the extracted information, and we introduce interesting and useful properties of the χ-matrix representation of quantum maps that can be used to establish a clearer path toward achieving full tomography of quantum processes in a scalable way.

I. INTRODUCTION



The number of parameters describing a quantum map scales exponentially with ln(D), with D the dimension of the Hilbert space $\mathcal{H}_D$ of the system. One can then argue that the resources required to obtain this exponentially large number of parameters will also necessarily increase exponentially. This is why the complete characterization of a quantum map is considered to be a nonscalable task. The task of characterizing a quantum map is known as quantum process tomography (QPT) [1], and the above is the main reason why full QPT is exponentially expensive. Moreover, many existing methods have another major defect: they are also inefficient in extracting partial information about the quantum process (for a review, see [2]). Recently, however, several works [3-10] have demonstrated that it is possible to extract partial but nevertheless relevant information in an efficient way [where by efficient we mean that it is done at a cost that scales at most polynomially with ln(D)]. This has opened a new chapter in quantum information processing toward the scalable characterization of quantum processes. These new methods share a common feature. They are based on the idea that the relevant properties of the quantum map can be obtained by averaging properties of a family of maps which are obtained from the original one. The averaging is done by an operation denoted as twirling [11] (which will be defined in detail later) and involves the application of certain operations before and after the application of the map.

In this work we present a review of the recent methods for partial QPT, establishing connections between them and adding results. We not only present a unifying perspective of these methods but also develop a better understanding of the problem at hand - the tomographic characterization itself.

The paper is organized as follows: In Sec. II we introduce the χ-matrix description of a quantum process, distinguishing completely positive (CP) maps and others that are not CP. In Sec. III we present the basic ideas behind the notion of a twirling operation. We show that the elements of the χ-matrix can be obtained from this type of operation. Moreover, some important properties of such a matrix (in particular, some useful relations between diagonal and off-diagonal elements) are discussed in Secs. II A and III A.

In Sec. IV we review the method of "selective efficient quantum process tomography," originally presented in [8, 10]. We reformulate this approach by using more general types of twirlings. Not only do we highlight the power of this method but we also establish the convenience of one type of twirl over another. Furthermore, we provide a clear prescription for its implementation when targeting the scalable measurement of several χ-matrix elements at a time.

In Sec. V we move to protocols utilizing simpler forms of twirling, which are substantially less demanding regarding their experimental implementation. We take the results from [6] and [9] and present them in a new compact form as a single protocol enabling us to obtain the diagonal elements of the χ-matrix grouped by "how many" and "which" qubits are affected by the quantum map. By fully proving the method by construction, we aim to further clarify its simple implementation as well as its limitations.

Finally, in Sec. VI we discuss the potential of these strategies toward achieving scalable complete tomography of a quantum process. We believe that the key to this lies in the hierarchization of the exponentially large number of parameters (in which the results of Secs. II A and III A play an important role). The methods described in Secs. IV and V retrieve the diagonal elements of the χ-matrix, and in Sec. VI we show how the diagonal elements provide information not only about themselves but also about the off-diagonal ones. This is what we identify as an information hierarchy.

Furthermore, we hope that this article sets a practi- 
cal path for experimentalists looking to implement quan- 
tum process characterization in a quantum information 
setting, that is, when the scalability of the tomographic 
method matters. 



II. THE χ-MATRIX DESCRIPTION OF A QUANTUM PROCESS

A general quantum process can be described by the action of an arbitrary map $\Lambda$ on the state $\rho$ in $\mathcal{H}_D$. Any linear map $\Lambda$ can be expressed as

$$\Lambda(\rho) = \sum_{l,l'=0}^{D^2-1} \chi_{l,l'}\, E_l\, \rho\, E_{l'}^\dagger \qquad (1)$$

where the operators $\{E_l,\ l = 0, \dots, D^2-1\}$ form a basis for the space of operators on $\mathcal{H}_D$. The complex numbers $\chi_{l,l'}$ form the so-called χ-matrix of the map. The χ-matrix obviously depends on the operator basis. Without loss of generality, we can take this basis to be orthogonal, that is, to be such that $\mathrm{Tr}[E_l E_{l'}^\dagger] = D\,\delta_{l,l'}$ [12]. It is simple to show that the map preserves the hermiticity of $\rho$ if and only if the χ-matrix is Hermitian itself (i.e., if $\chi_{l,l'} = \chi^*_{l',l}$). Moreover, the map $\Lambda$ is trace-preserving if and only if the condition $\sum_{l,l'} \chi_{l,l'}\, E_{l'}^\dagger E_l = I$ is satisfied. In such a case it is simple to count the number of independent real parameters defining the quantum map, which turns out to be $D^4 - D^2$ (the trace-preserving condition reduces the number of parameters by $D^2$ and also implies that the condition $\sum_l \chi_{l,l} = 1$ must be satisfied).
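As an illustration of Eq. (1), the following sketch computes the χ-matrix of a single-qubit map in the Pauli basis and checks the Hermiticity and trace-preserving conditions stated above. The amplitude-damping channel, its parameter `gamma`, and the helper name `chi_matrix` are assumptions chosen only for this example; they do not come from the text.

```python
import numpy as np

# Single-qubit Pauli basis; it satisfies Tr[E_l E_l'^dagger] = D * delta_{l,l'} with D = 2.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
BASIS = [I2, X, Y, Z]


def amplitude_damping(rho, gamma=0.3):
    """Illustrative trace-preserving map (an assumption, not taken from the text)."""
    k0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    k1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return k0 @ rho @ k0.conj().T + k1 @ rho @ k1.conj().T


def chi_matrix(channel, basis):
    """Return chi with channel(rho) = sum_{l,l'} chi[l,l'] E_l rho E_l'^dagger."""
    dim = basis[0].shape[0]
    # J = sum_{a,b} channel(|a><b|) (x) |a><b| equals sum chi[l,l'] |E_l>><<E_l'|.
    j_mat = np.zeros((dim * dim, dim * dim), dtype=complex)
    for a in range(dim):
        for b in range(dim):
            unit = np.zeros((dim, dim), dtype=complex)
            unit[a, b] = 1.0
            j_mat += np.kron(channel(unit), unit)
    # Row-major flattening of E_l is the vector |E_l>> = sum_a (E_l|a>) (x) |a>.
    vecs = np.array([e.reshape(-1) for e in basis])
    return vecs.conj() @ j_mat @ vecs.T / dim**2


chi = chi_matrix(amplitude_damping, BASIS)
# Trace-preserving condition: sum_{l,l'} chi[l,l'] E_l'^dagger E_l = I.
tp_operator = sum(chi[l, lp] * (BASIS[lp].conj().T @ BASIS[l])
                  for l in range(4) for lp in range(4))
print(np.round(chi, 3))
```

For this example the printed matrix is Hermitian with unit trace, and `tp_operator` reproduces the identity, in line with the conditions quoted after Eq. (1).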

We remark that this description is valid for any linear map. For the case of Hermitian maps it is possible to uncover further structure. In such a case the χ-matrix can be diagonalized by a unitary transformation. In matrix notation we can write $\chi = B^\dagger S B$, where $S$ is a diagonal matrix with real eigenvalues. The columns of the unitary matrix $B$ define the eigenvectors: The m-th component of the l-th eigenvector $b_l$ is $(b_l)_m = B_{m,l}$. By using this notation it is evident that the elements of the χ-matrix can be obtained as $\chi_{l,l'} = b_l^\dagger S\, b_{l'} = \sum_m B^*_{m,l}\, S_{m,m}\, B_{m,l'}$.

Some simple but useful results follow from this expression. Replacing it in the original formula for the map given in Eq. (1), we obtain the following alternative expression for an arbitrary linear Hermitian map:

$$\Lambda(\rho) = \sum_k S_{k,k}\, A_k\, \rho\, A_k^\dagger \qquad (2)$$

where the operators $A_k$ form an orthonormal basis defined as $A_k = \sum_l B^*_{k,l} E_l$. (The orthonormality of the $A_k$ follows from the fact that these operators are a linear combination of the original $E_l$ with coefficients that are elements of a unitary matrix.) It is worth noticing that the coefficients $S_{k,k}$, which are the eigenvalues of the χ-matrix, are necessarily real but can be either positive or negative. This representation for the quantum map is closely related to the so-called Kraus representation (which is obtained only if the eigenvalues $S_{m,m}$ are all positive, which is in turn valid for the case of CP maps only, as discussed below). In fact, Eq. (2) is a generalization of the Kraus representation valid for any linear Hermitian map.

More generally, the above expressions make evident that an arbitrary linear Hermitian map can always be written as the difference between two CP maps [13] (this is the case since any matrix $S$ can be expressed as the difference between two positive matrices).



A. χ-matrix of completely positive maps

Using the above results, we can derive some properties for the χ-matrix of completely positive maps (i.e., when the map $\Lambda$ and any trivial extension of it to a bigger Hilbert space preserve positivity). Since the matrix $S$ is positive, it is clear that the matrix elements of χ are obtained as the inner product between two eigenvectors $b_l$ defined through the positive matrix $S$,

$$\langle b_l, b_{l'} \rangle = b_l^\dagger S\, b_{l'}$$

From this observation we can conclude the following: First, it is evident that diagonal elements must be nonnegative (i.e., $\chi_{l,l} \ge 0\ \forall\, l$). Moreover, as the inner product satisfies the Cauchy-Schwarz inequality, we can obtain the following relation between diagonal and off-diagonal elements of the χ-matrix:

$$|\chi_{l,l'}|^2 \le \chi_{l,l}\, \chi_{l',l'} \qquad (3)$$

This means that, for any CP map, the diagonal elements of the χ-matrix are always nonnegative, and that they bound the corresponding off-diagonal elements. These two simple results are quite significant and they will prove very useful later on. Below, we will derive another relation between diagonal and off-diagonal coefficients of the χ-matrix valid for positive (not necessarily CP) maps. In this way we will also be able to establish some conditions to distinguish these two important classes of maps.
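The two CP-map properties above (nonnegative diagonal, and off-diagonal elements bounded as in Eq. (3)) can be checked numerically. The sketch below is only an illustration: the randomly generated Kraus operators, the seed, and all variable names are assumptions, not part of the original discussion.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
BASIS = [I2, X, Y, Z]

# Random single-qubit CP, trace-preserving map built from three Kraus operators.
rng = np.random.default_rng(0)
raw = rng.normal(size=(3, 2, 2)) + 1j * rng.normal(size=(3, 2, 2))
gram = sum(g.conj().T @ g for g in raw)
evals, evecs = np.linalg.eigh(gram)
inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.conj().T
kraus = [g @ inv_sqrt for g in raw]          # now sum_k K_k^dagger K_k = I

# K_k = sum_l c[k, l] E_l with c[k, l] = Tr[E_l^dagger K_k] / D,
# hence chi[l, l'] = sum_k c[k, l] * conj(c[k, l']).
coeff = np.array([[np.trace(e.conj().T @ k) / 2 for e in BASIS] for k in kraus])
chi = coeff.T @ coeff.conj()

diag = chi.diagonal().real
bound_ok = all(abs(chi[l, lp]) ** 2 <= diag[l] * diag[lp] + 1e-12
               for l in range(4) for lp in range(4))
print("diagonal:", np.round(diag, 4), "Eq. (3) holds:", bound_ok)
```

Because χ is built as a Gram matrix of the expansion coefficients, it is positive semidefinite by construction, which is the numerical counterpart of the positivity of $S$ used in the argument.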

III. TWIRLING OF A MAP, AND SAMPLING 
OF A TWIRL 

The action of twirling a map is depicted in Fig. 1. We have a quantum process characterized by a map $\Lambda$ that acts on a system (for example, a quantum information processor) originally prepared in an arbitrary state $|\phi_0\rangle$, as depicted in Fig. 1(a). We twirl the map by applying an operator $U$ before the map, and the operator $U^\dagger$ after, as in Fig. 1(b). Typically the twirling is considered as the average of this over different elements $U$, resulting in a net map $\Lambda^T$, the twirled map. Different families of $U$'s will return different types of twirl.



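(a) $|\phi_0\rangle$ -> $\Lambda$ -> $\Lambda(|\phi_0\rangle\langle\phi_0|)$

(b) $|\phi_0\rangle$ -> $U$ -> $\Lambda$ -> $U^\dagger$ -> $\Lambda^T(|\phi_0\rangle\langle\phi_0|)$

FIG. 1: Circuit representation of (a) the action of a map $\Lambda$; (b) the action of the map, now twirled by $U$.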



In particular, we are initially interested in the Haar twirl,

$$\Lambda^{HT}(\rho) = \int dU\; U^\dagger \Lambda(U \rho\, U^\dagger)\, U \qquad (4)$$

where $dU$ denotes the unitarily invariant Haar measure on U(D).

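There is a version of this twirl where the average is over the Haar measure in state space,

$$\langle\phi_0|\Lambda^{HT}(|\phi_0\rangle\langle\phi_0|)|\phi_0\rangle = \int d\psi\; \langle\psi|\Lambda(|\psi\rangle\langle\psi|)|\psi\rangle \qquad (5)$$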



The relation between the two is straightforward if we notice that if $U$ is randomly drawn according to the Haar measure on operator space, then $|\psi\rangle = U|\phi_0\rangle$ corresponds to the Haar measure on vector space - for any arbitrary fixed state $|\phi_0\rangle$.

There are several previous results concerning the Haar 
twirl, in its forms both in operator space [3, [3] and in 
state space (l5j . Summarizing this literature, we limit 
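ourselves to state the following general mathematical formula:

$$\int dU\; \mathrm{Tr}[A_1 U^\dagger B_1 U A_2 U^\dagger B_2 U] = \frac{\mathrm{Tr}[A_1 A_2]\,\mathrm{Tr}[B_1]\,\mathrm{Tr}[B_2] + \mathrm{Tr}[A_1]\,\mathrm{Tr}[A_2]\,\mathrm{Tr}[B_1 B_2]}{D^2-1} - \frac{\mathrm{Tr}[A_1 A_2]\,\mathrm{Tr}[B_1 B_2] + \mathrm{Tr}[A_1]\,\mathrm{Tr}[A_2]\,\mathrm{Tr}[B_1]\,\mathrm{Tr}[B_2]}{D(D^2-1)} \qquad (6)$$

for any operators $A_1$, $A_2$, $B_1$, $B_2$ in $\mathcal{H}_D$.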

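Given this, and explicitly using the trace-preserving condition, the χ-matrix elements can be expressed as the outcome of a twirl,

$$\int d\psi\; \langle\psi|\Lambda(E_l^\dagger|\psi\rangle\langle\psi|E_{l'})|\psi\rangle = \frac{D\chi_{l,l'} + \delta_{l,l'}}{D+1} \qquad (7)$$

as already stated in [8].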
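The twirl relation can be tested numerically in its state-space form, $\int d\psi\, \langle\psi|\Lambda(E_l^\dagger|\psi\rangle\langle\psi|E_{l'})|\psi\rangle = (D\chi_{l,l'} + \delta_{l,l'})/(D+1)$, which is consistent with the diagonal value $(D\chi_{l,l}+1)/(D+1)$ quoted in the next subsection. The sketch below is a minimal single-qubit check under stated assumptions: the Haar integral is replaced by the average over the six eigenstates of the Pauli matrices (a complete set of MUBs for D = 2, using the equivalence the paper cites from [22]), and the amplitude-damping channel with `GAMMA = 0.3` is an arbitrary test map.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
BASIS = [I2, X, Y, Z]
D = 2

GAMMA = 0.3  # assumed damping strength for the illustrative test map
KRAUS = [np.array([[1, 0], [0, np.sqrt(1 - GAMMA)]], dtype=complex),
         np.array([[0, np.sqrt(GAMMA)], [0, 0]], dtype=complex)]


def channel(op):
    """Amplitude damping, extended linearly to arbitrary operators."""
    return sum(k @ op @ k.conj().T for k in KRAUS)


# chi[l, l'] = sum_k c[k, l] conj(c[k, l']) with c[k, l] = Tr[E_l^dagger K_k] / D.
coeff = np.array([[np.trace(e.conj().T @ k) / D for e in BASIS] for k in KRAUS])
chi = coeff.T @ coeff.conj()

# Eigenstates of Z, X and Y: the three mutually unbiased bases of one qubit.
raw_states = ([1, 0], [0, 1], [1, 1], [1, -1], [1, 1j], [1, -1j])
STATES = [np.array(v, dtype=complex) / np.linalg.norm(v) for v in raw_states]


def twirl_element(l, lp):
    """Average of <psi| channel(E_l^dagger |psi><psi| E_l') |psi> over the MUB states."""
    total = 0.0
    for psi in STATES:
        proj = np.outer(psi, psi.conj())
        out = channel(BASIS[l].conj().T @ proj @ BASIS[lp])
        total += psi.conj() @ out @ psi
    return total / len(STATES)


lhs = np.array([[twirl_element(l, lp) for lp in range(4)] for l in range(4)])
rhs = (D * chi + np.eye(4)) / (D + 1)
print(np.allclose(lhs, rhs))
```

Replacing the six-state average by random Haar-distributed states would give the same result only up to sampling noise, which is the situation addressed in Sec. III B.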



A. The χ-matrix of positive (but not necessarily CP) maps

Equation (7) is valid for any map $\Lambda$ under study. In particular, for processes that take positive operators into positive operators, Eq. (7) defines a valid inner product

$$\langle E_l, E_{l'} \rangle = \int d\psi\; \langle\psi|\Lambda(E_l^\dagger|\psi\rangle\langle\psi|E_{l'})|\psi\rangle$$

Notice that we have $\langle E_l, E_l \rangle = (D\chi_{l,l} + 1)/(D+1) \ge 0$. This implies that for a positive (but not necessarily CP) map, the diagonal elements of the χ-matrix can be negative but only up to an exponentially vanishing value: $\chi_{l,l} \ge -1/D$. Also, notice that $\langle E_l, E_l \rangle$ is a survival probability: the probability of the system remaining in its initial state after applying the twirled map $\Lambda^{HT}$ to it. Therefore, $(D\chi_{l,l} + 1)/(D+1) \le 1$, which implies $\chi_{l,l} \le 1$.

Moreover, again using the Cauchy-Schwarz inequality on this inner product, we obtain that for $l \ne l'$

$$|\chi_{l,l'}|^2 \le \chi_{l,l}\,\chi_{l',l'} + \frac{\chi_{l,l} + \chi_{l',l'}}{D} + \frac{1}{D^2} \qquad (8)$$



So for large systems where we can consider $D \gg 1$, the
off-diagonal matrix elements are effectively bounded by the
diagonal ones. These bounds also suggest that non-CP 
but positive processes are "exponentially close" to CP 
ones. This is an interesting result in the framework of 
open quantum systems, where there are still important 
discussions about what mathematical conditions a phys- 
ical map should fulfill [3 EE [13 . 
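A concrete case may help to see how tight the lower bound $\chi_{l,l} \ge -1/D$ is. The single-qubit transpose map is a standard example of a positive map that is not CP; it is used here purely as an illustration and is not discussed in the text. In the Pauli basis it has a diagonal χ-matrix with entries (1/2, 1/2, -1/2, 1/2) for (I, X, Y, Z), so the Y element sits exactly at $-1/D$. The sketch below verifies this decomposition on a random operator and evaluates the survival probabilities $(D\chi_{l,l}+1)/(D+1)$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]
D = 2

# Diagonal chi-matrix of the transpose map in the Pauli basis (order I, X, Y, Z).
CHI_DIAG = np.array([0.5, 0.5, -0.5, 0.5])

rng = np.random.default_rng(2)
rho = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

# Eq. (1) with a diagonal chi: sum_l chi_{l,l} P_l rho P_l should equal rho^T.
rebuilt = sum(c * (p @ rho @ p) for c, p in zip(CHI_DIAG, PAULIS))

# Survival probabilities (D chi_{l,l} + 1) / (D + 1) must lie in [0, 1].
survival = (D * CHI_DIAG + 1) / (D + 1)
print(np.allclose(rebuilt, rho.T), survival)
```

The negative entry signals that the map is not CP, while all survival probabilities stay within [0, 1], as required for a positive map.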



B. (Approximate) sampling of a twirl 

Equation (7) already demonstrates the usefulness of twirls in extracting the elements of the χ-matrix. This is indeed what lies at the heart of the methods developed in [?]. It is evident then that we will need to implement the twirl experimentally (in either operator or state space). Unfortunately, as we will see, the number of elements in the twirls that are of interest to us is infinite or grows exponentially with ln(D) [18]. Thus implementing the twirl perfectly is a nonscalable task. However, it was initially suggested in [?] that we can approximate the twirl by sampling randomly over the family of $U$'s, say M times, as depicted in Fig. 2. In the case of a state twirl, the approach would be the same, since in practice a state twirl results from implementing a series of operations (which would take the place of the $U$'s) on a convenient initial state $|\phi_0\rangle$.

FIG. 2: Circuit representation of the twirling of $\Lambda$, approximated by sampling M times over the elements that constitute the family of twirl operators (the $U$'s). Each time the system is prepared in the same initial state $|\phi_0\rangle$. The average of these M measurements will retrieve the desired probabilities.

If we are interested in measuring the probability of finding the system in any given state, this outcome will be a boolean variable retrieved with a standard deviation $\sigma \le 1/\sqrt{M}$ (following the central limit theorem with $M \to \infty$), so for a desired precision $\epsilon$ we must have $M \ge \epsilon^{-2}$. On the other hand, the Chernoff bound tells us that for a desired precision $\epsilon$ and an error probability $\delta \ll 1$, we must have $M \ge \ln(2/\delta)/(2\epsilon^2)$, which is a stronger requirement when $\delta < 2e^{-2}$. This is a bound on the error probability and not on the error itself; however, it is rigorous for arbitrary M. In any case M should satisfy both conditions. Since M is independent of the size of the system, this ensures the scalability of the experimental implementation if each of the M realizations themselves can be implemented efficiently. This holds of course unless the targeted probabilities are expected to be of order $O(1/\sqrt{D})$, in which case the estimation of each probability would require an exponentially large number of realizations. However, this would be the case for a process close to a random channel, and such processes are usually of no interest in quantum information and/or in relatively controllable quantum systems.
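The two sampling requirements just quoted, $M \ge \epsilon^{-2}$ and $M \ge \ln(2/\delta)/(2\epsilon^2)$, are easy to combine into a single number of realizations. The helper below is a small sketch; the function name `required_samples` and the example values of `eps` and `delta` are hypothetical.

```python
import math


def required_samples(eps, delta):
    """Smallest integer M satisfying both bounds quoted in the text.

    eps   -- desired precision on the estimated probability
    delta -- tolerated probability of exceeding that precision
    """
    clt_bound = 1.0 / eps**2                                  # M >= eps^-2
    chernoff_bound = math.log(2.0 / delta) / (2.0 * eps**2)   # M >= ln(2/delta) / (2 eps^2)
    return math.ceil(max(clt_bound, chernoff_bound))


# The Chernoff requirement dominates only when delta < 2 * exp(-2), about 0.27.
for delta in (0.5, 0.1, 0.01):
    print(delta, required_samples(0.1, delta))
```

Note that the dimension D does not enter the function at all, which is the point made in the text: the number of realizations is set by the target precision, not by the size of the system.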

Finally, we note that we could separate the average of 
binary outcomes (the result of projective measurements) 
required to determine the probability for an experiment 
with a fixed twirl operator U, from the average of 
experiments with different U's. This is useful in cases 
where repeating an experiment with a fixed U is trivial 
compared to running a new one with a different twirl 
operator. In this case, other interesting bounds to the 
error can be applied, as for example in the experimental 
work in [H. 



In what follows we restrict ourselves to a Hilbert space that is an n-fold tensor product of a two-level system space, so $D = 2^n$. Moreover, we work with a specific set of operators $\{E_l\}$: the generalized Pauli operators (also called the product operator basis). We will specifically denote them as $\{P_l\}$, $P_l = \bigotimes_{j=1}^{n} P_l^{(j)}$. Each $P_l^{(j)}$ is an element of the Pauli group $\{I, \sigma_x, \sigma_y, \sigma_z\}$ for the j-th qubit. $P_0 = I$ is the identity operator in $\mathcal{H}_D$, and for $l > 0$ at least one factor in each $P_l$ is a Pauli matrix. Notice that $P_l^\dagger = P_l$ and that $\mathrm{Tr}[P_l P_{l'}] = D\,\delta_{l,l'}$ indeed. From now on, the χ-matrix elements will always be associated with this basis.
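The product operator basis can be generated explicitly for small n. The following sketch builds all $D^2 = 4^n$ tensor products and verifies $P_l^\dagger = P_l$ and $\mathrm{Tr}[P_l P_{l'}] = D\,\delta_{l,l'}$; the function name `pauli_basis` and the ordering of the operators are choices made for this example only.

```python
import itertools
import numpy as np

SINGLE = [
    np.eye(2, dtype=complex),                       # I
    np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),     # sigma_z
]


def pauli_basis(n):
    """All 4**n tensor products of single-qubit Pauli operators; index 0 is the identity."""
    basis = []
    for factors in itertools.product(range(4), repeat=n):
        op = np.array([[1.0 + 0j]])
        for f in factors:
            op = np.kron(op, SINGLE[f])
        basis.append(op)
    return basis


n = 2
D = 2**n
basis = pauli_basis(n)
# Matrix of traces Tr[P_l P_l']; it should equal D times the identity.
gram = np.array([[np.trace(p @ q) for q in basis] for p in basis])
print(len(basis), np.allclose(gram, D * np.eye(D * D)))
```

Storing the operators as dense matrices is only feasible for a handful of qubits; the scalable protocols discussed below never require building this basis explicitly.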



IV. METHODS USING A FULL SPACE 
TWIRLING OF THE MAP UNDER STUDY 

In this section we start by studying the methods utilizing
a full twirl over U(D). If the twirl depicted in Fig.
[5] is over U(D), the survival probability is the average 
fidelity of the original map $\Lambda$ [?]. This is in fact Eq. (5), which is the definition of the average fidelity of a quantum channel $\Lambda$ [21], $F(\Lambda)$.

The tomographic methods in [8, 10] are actually presented not in terms of twirl operators but rather in terms of the states of mutually unbiased bases (MUBs): $\{|\psi_{J,m}\rangle,\ J = 0, \dots, D;\ m = 1, \dots, D\}$. Here we introduce their equivalents using twirls in operator space. Nevertheless, further analysis will in turn lead to a slight preference toward the former one.

We rely on [?] to establish the equivalence between the Haar twirl in operator space and the Clifford twirl in U(D) for D > 2 (and we will explicitly prove it for D = 2 in Sec. V). On the other hand, the equivalence between a Haar twirl in state space and a twirl using MUB states, for dimensions that are powers of prime numbers, is presented in [22]. Altogether, we can write

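$$\langle\phi_0|\Lambda^{HT}(|\phi_0\rangle\langle\phi_0|)|\phi_0\rangle = \frac{1}{|C|}\sum_{l=1}^{|C|} \langle\phi_0|C_l^\dagger \Lambda(C_l|\phi_0\rangle\langle\phi_0|C_l^\dagger)\, C_l|\phi_0\rangle = \frac{1}{D(D+1)}\sum_{J=0}^{D}\sum_{m=1}^{D} \langle\psi_{J,m}|\Lambda(|\psi_{J,m}\rangle\langle\psi_{J,m}|)|\psi_{J,m}\rangle \qquad (9)$$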



where the $C_l$ are the Clifford operators in U(D) and $|\phi_0\rangle$ is an arbitrary fixed state. Both these twirls imply the same cost, as preparing MUB states starting from the computational basis and implementing the $C_l$ require the same resources: $O(n^2)$ one-qubit and two-qubit gates [10, ?]. And again, the number of Clifford operators $|C|$ scales exponentially with ln(D), as does the number of MUB states (so in both cases we will resort to sampling the twirl).

In [8, 10] it was shown how to selectively measure any diagonal χ-matrix element using an MUB twirl. There is an equivalent to this using a Clifford twirl in U(D). As presented in [?], if we implement an intermediary extra gate $P_l$ before completing the twirl (see Fig. 3), the survival probability is

$$\mathrm{Tr}[|\phi_0\rangle\langle\phi_0|\, \Lambda_l^{T}(|\phi_0\rangle\langle\phi_0|)] = \frac{D\chi_{l,l} + 1}{D+1} \qquad (10)$$



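This can be proven straightforwardly from Eq. (6).

$|\phi_0\rangle$ -> $U$ -> $\Lambda$ -> $P_l$ -> $U^\dagger$ -> $\Lambda_l^T(|\phi_0\rangle\langle\phi_0|)$

FIG. 3: Circuit representation of the action of a map $\Lambda_l$, with $\Lambda_l(\rho) = P_l \Lambda(\rho) P_l$, twirled by $U$.
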
We are thus able to measure efficiently one $\chi_{l,l}$ at a time (selective efficient quantum process tomography - SEQPT [?]). However, we can modify the protocol to automatically select and retrieve the largest $\chi_{l,l}$: the coefficients such that $\chi_{l,l} > 2/M$. The strategy goes as follows.

We first revisit the method as presented in [8] for an MUB twirl. As depicted in Fig. 4(a), we consider a single experiment where the system is prepared in a randomly chosen MUB state $|J,m\rangle = V_{J,m}|0\rangle$. $V_{J,m}$ represents the change of basis operation between the computational state $|0\rangle$ and the targeted MUB state. We measure at the end the state in the computational (Zeeman) basis, obtaining then an n-bit string $|v_{out}\rangle$ (where $v_{out}$ is a boolean vector of length n that labels the states as binary numbers). Considering that $V_{J,m}|v_{out}\rangle = |J,m'\rangle$ is just another state of the MUB, and that there are D possible Pauli operators that take $|J,m\rangle$ to $|J,m'\rangle$ (up to a global phase) [10], we can regard this experiment as equivalent to the one in Fig. 3 but where now we have D possible Pauli operators playing the role of the intermediary $P_l$.

To gain further insight into the mechanism of this result, we recall these dynamics using the stabilizer formalism [23]. We describe the state $|v_{out}\rangle$ with the subset $B_Z$ formed by the n commuting Pauli operators $\{\sigma_z^{(1)}, \dots, \sigma_z^{(n)}\}$ and a string $s_{out}$ of n signs, ±1, corresponding to the eigenvalues of $|v_{out}\rangle$ for that subset. These n operators generate the maximally commuting (Abelian) group of D Pauli operators that stabilize the computational basis. On the other hand, the state $|0\rangle$ is described by $B_Z$ with a string $s_0$ of all +1 signs. The action of $V_{J,m}$ on $|0\rangle$ is equivalent to changing $(B_Z, s_0)$ to $(B_J, s_0)$, where now $B_J$ is another subset of n commuting Pauli operators - the generators of the group that stabilizes the D states corresponding to the MUB labeled by J. Also, the action of $V_{J,m}$ on $|v_{out}\rangle$ is equivalent to changing $(B_Z, s_{out})$ to $(B_J, s_{out})$. We now use that the state $(B_J, s_{out})$ can be thought of as the result of a Pauli operator $P_{out}$ acting on $(B_J, s_0)$, which leaves us with the scheme depicted in Fig. 4(b). $P_{out}$ must fulfill the requisite of commuting (anticommuting) with the Pauli operators in $B_J$ that have a corresponding +1 (-1) in $s_{out}$. We express this condition as the commutation relations



$$[P_{out},\, V_{J,m}^\dagger\, \sigma_z^{(j)}\, V_{J,m}]_\pm = 0, \qquad j = 1, \dots, n \qquad (11)$$

where $[\ ,\ ]_\pm$ stands for commutator or anticommutator, depending on the signs of $s_{out}$. But as already stated before, there will be D possible candidates for the intermediary $P_{out}$. This can be seen as follows. First, we notice that Eq. (11) can be rewritten as $[P'_{out}, \sigma_z^{(j)}]_\pm = 0$, where we have defined

$$P'_{out} = V_{J,m}\, P_{out}\, V_{J,m}^\dagger \qquad (12)$$



It is easy to see that the possible $P'_{out}$ will be the tensor products that have $I$ or $\sigma_z$ for the qubits that have +1 in $s_{out}$, and $\sigma_x$ or $\sigma_y$ for the other qubits. There are $D = 2^n$ of these products, and then the actual $P_{out}$'s could be obtained by inverting Eq. (12). Therefore, we are indeed left with an experiment equivalent to the one in Fig. 3 but with D possible intermediary Pauli operators.
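The counting argument above can be made concrete by listing the candidate operators $P'_{out}$ for a given sign string. The sketch below represents each tensor product as a string of the letters I, X, Y, Z (one per qubit); the function name `compatible_paulis` and this string encoding are assumptions made for illustration. It returns the rotated operators $P'_{out}$ only; recovering $P_{out}$ itself would additionally require the change-of-basis operation, which is not modeled here.

```python
import itertools


def compatible_paulis(signs):
    """List the Pauli strings P'_out compatible with a measured sign string.

    signs -- sequence of +1 / -1 values, one per qubit (the string s_out).
    A qubit with sign +1 needs a factor commuting with sigma_z (I or Z);
    a qubit with sign -1 needs a factor anticommuting with it (X or Y).
    """
    choices = [("I", "Z") if s == 1 else ("X", "Y") for s in signs]
    return ["".join(factors) for factors in itertools.product(*choices)]


candidates = compatible_paulis([+1, -1, +1])
print(len(candidates), candidates)
```

For n qubits the list always has $2^n = D$ entries, which is why a single experiment cannot single out the intermediary Pauli operator and pairs of experiments are needed.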

The key here is that for two different sets $B_{J_1}$ and $B_{J_2}$ corresponding to two different MUBs, there can be only one $P_{out}$ in common for both. This is because the D + 1 subsets $B_J$ are obtained by partitioning the $D^2 - 1$ nonidentity Pauli operators into D + 1 different subsets of D - 1 commuting operators. The $B_J$ are then the generators of these subsets (plus the identity). Given their properties, any two $B_{J_1}$ and $B_{J_2}$, plus commutation relations with them [Eq. (11)], define a unique operator $P_{out}$. Moreover, given the nature of the operators involved (Pauli gates and the operations involved in the change of basis for MUBs), and the number of equations (n), $P_{out}$ can be established efficiently [10].

Therefore, if we consider together two experiments $(J_1, m_1, s_{out}^{(1)})$ and $(J_2, m_2, s_{out}^{(2)})$ [each like in Fig. 4(b), with $J_1 \ne J_2$], there will be only one possible intermediary Pauli gate compatible with both experiments, and it can be computed in a scalable way. In practice, we will perform M experiments, and analyzing all the possible M(M - 1)/2 pairs we will establish the intermediary Pauli gates that have occurred at least twice, which will be at most $O(M^2)$. Then, we just count the number of experiments $M_+$ where those operators have potentially occurred among the D possible choices. The corresponding $\chi_{l,l}$ can be estimated as $(D\chi_{l,l} + 1)/(D+1) = M_+/M$. Notice that $\chi_{l,l}$ is then estimated with a standard deviation $\le 1/\sqrt{M}$. We also recall that $\sum_l \chi_{l,l} = 1$, so we can use this to estimate altogether the magnitude of the smaller coefficients.
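The last step of the protocol is a simple inversion of the relation $(D\chi_{l,l} + 1)/(D+1) = M_+/M$. The helper below sketches it; the function name `chi_from_counts` and the example counts are hypothetical, and no attempt is made to model the pair analysis that produces $M_+$.

```python
def chi_from_counts(m_plus, m_total, n_qubits):
    """Estimate chi_{l,l} from M_+ compatible outcomes out of M experiments.

    Inverts (D * chi + 1) / (D + 1) = M_+ / M, with D = 2**n_qubits.
    """
    dim = 2**n_qubits
    survival = m_plus / m_total
    return ((dim + 1) * survival - 1) / dim


# Hypothetical counts: 400 compatible outcomes out of 1000 runs on two qubits.
print(chi_from_counts(400, 1000, 2))
```

A survival frequency of exactly $1/(D+1)$ maps to $\chi_{l,l} = 0$, so statistical fluctuations around that value can produce slightly negative estimates for coefficients that are actually zero.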



FIG. 4: Circuit representation of equivalent schemes [panels (a)-(d)] to determine the largest $\chi_{l,l}$, by considering pairs of experiments.

This strategy can also be applied using Clifford gates acting on the initial state instead of using MUB states, as depicted in Fig. 4(c). Again, we use the stabilizer formalism as described before. Since the Clifford group is the normalizer of the Pauli group, indeed $C P_k C^\dagger \doteq P_{k'}$ (where $\doteq$ means equal up to a global phase). So this means that the action of $C$ on a state is equivalent to changing $(B_Z, s)$ to $(B_P, s)$, where now $B_P$ is another subset of n commuting Pauli operators. Again we use that the state $(B_P, s_{out})$ can be thought of as the result of a Pauli operator $P_{out}$ acting on $(B_P, s_0)$, which now leaves us in the scheme depicted in Fig. 4(d). Again, there are D possible operators that fulfill the requisite of commuting (anticommuting) with the Pauli operators in $B_P$ that have a corresponding +1 (-1) in $s_{out}$. The argument is completely analogous to the one for the MUB twirl.

We thus resort again to combining two experiments. However, the case of two Clifford twirl experiments is not as simple as the MUB twirl one. It no longer holds that given two experiments there is one single possible intermediary Pauli gate, because two different Clifford gates may map $B_Z$ to two subsets $B_{P_1}$ and $B_{P_2}$ that generate two Pauli subgroups that have some operators in common. So not every pair of experiments, even if $C_1 \ne C_2$, will be useful toward establishing the coefficients above the threshold of 2/M. In practice, we should determine the $2 \times n$ operators $C_k\, \sigma_z^{(j)}\, C_k^\dagger$ (where k = 1, 2 label two randomly chosen Clifford gates) and check whether they constitute two independent sets of generators. If that is the case, then there is indeed a unique intermediary Pauli gate, as is always the case with the MUB twirl. And thanks to the Gottesman-Knill theorem, this can be done efficiently with a classical computer.

To compare both methods, we consider the probability of successfully determining a unique intermediary Pauli gate given two different experiments drawn from a pool of M experiments. In the case of the MUB twirl, the probability of success is $\mathcal{P}_{MUB} = D/(D+1)$, since there are D + 1 possible values of $J_1$, and having randomly drawn one, there are D different ones we could draw for $J_2$.

For the Clifford twirl case, this probability can be calculated as the probability that, given two randomly chosen maximal Abelian Pauli subgroups, the only common element in both groups is the identity, up to a phase. To compute this probability we proceed as follows. We fix the first maximal Abelian group and then compute the probability of adding one by one the operators belonging to the second group. The fixed group has, up to a phase, D - 1 nonidentity Pauli operators. If we randomly choose a nonidentity Pauli operator, namely $P_1$, what is the probability of it not belonging to the fixed group? It is straightforward to see that this probability is $(D^2 - D)/(D^2 - 1)$. Now, from the Pauli operators that commute with $P_1$, what is the probability of picking one Pauli operator $P_2$ that does not belong to the first group? Again, there are a total of $D^2/2 - 2$ Pauli operators which commute with $P_1$ and are neither $P_1$ nor the identity, but D/2 - 1 of those belong to the first group. So the probability of this happening is $(D^2/2 - D/2 - 1)/(D^2/2 - 2)$. We proceed in the same way, computing the probability of picking a Pauli operator that belongs neither to the first group nor to the group generated by the previously chosen operators. The product of all those probabilities is the probability $\mathcal{P}_C$ of having only one intermediary Pauli operator given two Clifford twirl experiments:

$$\mathcal{P}_C = \prod_{j=0}^{n-1} \frac{D^2/2^j - D/2^j - 2^j + 1}{D^2/2^j - 2^j} \qquad (13)$$



As shown in Fig. 5, this probability is smaller than but asymptotically equivalent to $\mathcal{P}_{MUB}$. For the experiments that are being done nowadays, with only a few qubits, the MUB twirl still requires far fewer experimental runs to obtain the larger coefficients.
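The comparison between the two success probabilities can be tabulated for small n. The sketch below evaluates $D/(D+1)$ for the MUB twirl and a product formula for the Clifford twirl. The general factor used in `p_clifford` is an assumption: it extends the pattern of the two explicit factors given in the text, $(D^2 - D)/(D^2 - 1)$ and $(D^2/2 - D/2 - 1)/(D^2/2 - 2)$, to the j-th step, and should be read as an illustration of that counting rather than as an independently verified result.

```python
from fractions import Fraction


def p_mub(n):
    """Success probability D / (D + 1) quoted for the MUB twirl."""
    dim = 2**n
    return Fraction(dim, dim + 1)


def p_clifford(n):
    """Product over j = 0..n-1 of the step-by-step probabilities (assumed general form)."""
    dim = 2**n
    prob = Fraction(1)
    for j in range(n):
        commuting = Fraction(dim * dim, 2**j) - 2**j   # candidates at step j
        in_fixed_group = Fraction(dim, 2**j) - 1        # of those, in the first group
        prob *= (commuting - in_fixed_group) / commuting
    return prob


for n in range(1, 7):
    print(n, float(p_mub(n)), float(p_clifford(n)))
```

Exact rational arithmetic is used so that the comparison between the two probabilities is not affected by rounding; the j = 0 factor coincides with $D/(D+1)$, so within this formula the Clifford value can never exceed the MUB value.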




FIG. 5: Success probability of the two methods: $\mathcal{P}_{MUB}$, using an MUB twirl; $\mathcal{P}_C$, using a Clifford twirl.



At this point we can conclude that the method introduced in [8] is indeed the most practical one, and that it can retrieve the largest diagonal elements of the χ-matrix in the sense that they are above the threshold of 2/M, for M realizations of the MUB twirl. This is done in a scalable way, and with a standard deviation $\le 1/\sqrt{M}$.

This protocol has been experimentally implemented recently, with photons, to characterize maps on a one-qubit space [20].

Nevertheless, although efficient, the method demands the errorless implementation of the $V_{J,m}$ gates (or of the equally demanding Clifford gates) - at least relatively errorless compared with any errors in the implementation of $\Lambda$. If we have a functional quantum device that implements the Hadamard, phase, and controlled-NOT (CNOT) gates (which are the gates required to implement the Clifford gates [?] or the MUB states [?]) and the Pauli operators with enough accuracy, we will be in a position to study more complex maps with twirls in U(D). If however we are still aiming to study gates and sequences whose complexity is comparable to that of a Clifford gate in U(D), this method is unsuitable. In this case, a more practical alternative arises from the combination of the methods presented in [6] and [9], which will be the object of the next section. This proposal allows us to establish the diagonal elements of the χ-matrix coarse-grained in direction. Indeed, this information is particularly useful when seeking information for quantum error correction codes, where the particular type of error ($\sigma_x$, $\sigma_y$ or $\sigma_z$) is irrelevant.

The following method is experimentally much less demanding, since it requires a twirl in $U(2)^{\otimes n}$ rather than one in $U(2^n)$. On the other hand, as we will see, it assumes a certain structure in the map under study. An example of such a scenario was explicitly shown in the experimental work in [9], using a liquid-state nuclear magnetic resonance (NMR) processor with four qubits. A relatively large number of qubits easily shows the significant difference between required resources; to our knowledge, this is the largest number of qubits on which a complete [?] or partial [?] quantum process characterization has been attempted.



V. METHODS USING A ONE-QUBIT 
TWIRLING OF THE MAP UNDER STUDY 



In this section we concentrate on methods based on a one-qubit twirling of a map [?]. That is, the twirl is a tensor product of twirling operators $U$ acting on each qubit. The two protocols presented in [6] and [9] can actually be merged into one. We review the compact method by proving it all together, which also shows clearly the simplicity and economy of its implementation (since both [6] and [9] include their experimental implementation).

In Sec. III we highlighted the promising role of the Haar twirl. This for example motivated the first works [?, 4]. However, as mentioned before, the work by Dankert et al. [?] pointed out an equivalence between a Haar twirl and a Clifford twirl.

Rather than starting from the Haar twirl and crossing over to the Clifford twirl, we will work directly with the Clifford gates and prove everything from scratch. For this we will use that the Clifford operators can in turn be decomposed into Pauli operators (a normal subgroup of the Clifford group) and the so-called Symplectic operators (the resulting quotient group).

We will follow the notation of [6]. Each index $l$
carries the following information: $w$, $\nu_w$, $i_w$. $w$ is the
Pauli weight of $P_l$, that is, how many of the factors
in $P_l$ are nonidentity. The index $\nu_w \in \{1, \ldots, \binom{n}{w}\}$
counts the number of distinct ways that $w$ nonidentity
Pauli operators can be distributed over the $n$ factor
spaces. The index $i_w$ is a vector of length $w$ of the form
$i_w = (i_1, i_2, \ldots, i_w)$ with each component being 1 = x,
2 = y, or 3 = z to denote which Pauli matrix occupies
that respective factor position in the tensor product
forming $P_l$. There are $3^w$ of these $i_w$ for given $w$ and $\nu_w$.
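
The labeling just described can be made concrete with a short enumeration. The following Python sketch is only an illustration under our own assumptions: the function name `pauli_labels` is hypothetical, and the order in which the positions $\nu_w$ are enumerated is a convention chosen here, not one fixed by the text.

```python
from itertools import combinations, product

def pauli_labels(n):
    """List the labels (w, nu_w, qubits, i_w) of all n-qubit Pauli operators.

    w      : Pauli weight (number of nonidentity factors)
    nu_w   : 1-based counter over the C(n, w) possible sets of positions
    qubits : the positions themselves (an illustrative convention)
    i_w    : tuple of length w with entries 1 = x, 2 = y, 3 = z
    """
    labels = []
    for w in range(n + 1):
        for nu_w, qubits in enumerate(combinations(range(n), w), start=1):
            for i_w in product((1, 2, 3), repeat=w):
                labels.append((w, nu_w, qubits, i_w))
    return labels

labels = pauli_labels(3)
print(len(labels))                           # total number of labels, 4**3
print(sum(1 for l in labels if l[0] == 2))   # weight-2 labels: C(3, 2) * 3**2
```

For each fixed pair $(w, \nu_w)$ the inner loop produces exactly $3^w$ vectors $i_w$, which is the counting used in the text.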

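We start first with a Pauli twirl (PT) of the map. Thus
$\Lambda$ becomes

\begin{align}
\Lambda^{PT}(\rho) &= \frac{1}{D^2} \sum_{m=0}^{D^2-1} P_m \Lambda(P_m \rho P_m) P_m \tag{14} \\
&= \frac{1}{D^2} \sum_{m=0}^{D^2-1} \sum_{l,l'} \chi_{l,l'} P_m P_l P_m \rho P_m P_{l'} P_m \tag{15} \\
&= \sum_{l=0}^{D^2-1} \chi_{l,l} P_l \rho P_l. \tag{16}
\end{align}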



This result was proven in [24]. It can also be seen as
follows: For $l = l'$, $P_m P_l P_m \rho P_m P_l P_m = P_l \rho P_l$ since each
$P_l$ either commutes or anticommutes with each $P_m$. And
if $l \neq l'$, for each $j$-th factor in which they differ, we have
$P_m^{(j)} P_l^{(j)} P_m^{(j)} \rho P_m^{(j)} P_{l'}^{(j)} P_m^{(j)} = \pm P_l^{(j)} \rho P_{l'}^{(j)}$,
with each sign happening for half of the four possible $P_m^{(j)}$. Thus they
cancel out in the sum.
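
The sign-cancellation argument can be checked numerically for a small system. The sketch below is an illustration only, assuming the convention $\Lambda(\rho) = \sum_{l,l'} \chi_{l,l'} P_l \rho P_{l'}$ with unnormalized Pauli operators; the function names are hypothetical and the random $\chi$-matrix is just a test input.

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices, indexed 0 = I, 1 = x, 2 = y, 3 = z.
PAULIS_1Q = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def pauli_basis(n):
    """All 4**n tensor-product Pauli operators on n qubits."""
    basis = []
    for idx in itertools.product(range(4), repeat=n):
        op = np.array([[1.0 + 0j]])
        for i in idx:
            op = np.kron(op, PAULIS_1Q[i])
        basis.append(op)
    return basis

def pauli_twirl_chi(chi, n):
    """chi-matrix of the Pauli-twirled map, accumulated term by term."""
    P = pauli_basis(n)
    D = 2 ** n
    twirled = np.zeros_like(chi)
    for Pm in P:
        # P_m P_l P_m = s_l P_l, with s_l = +1 (commute) or -1 (anticommute).
        signs = np.array([np.trace(Pm @ Pl @ Pm @ Pl).real / D for Pl in P])
        twirled += np.outer(signs, signs) * chi
    return twirled / len(P)

# Usage: a random positive semidefinite chi-matrix for n = 2 qubits.
rng = np.random.default_rng(0)
A = rng.normal(size=(16, 16)) + 1j * rng.normal(size=(16, 16))
chi = A @ A.conj().T
chi /= np.trace(chi).real
twirled = pauli_twirl_chi(chi, 2)
print(np.allclose(twirled, np.diag(np.diag(chi))))
```

The off-diagonal entries vanish because, for $l \neq l'$, the product of signs $s_l s_{l'}$ is $+1$ for half of the $P_m$ and $-1$ for the other half, while the diagonal entries are left untouched.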

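We consider now a Symplectic one-qubit twirl (S1T)
of the form

\begin{align}
\Lambda^{S1T}(\rho) &= \frac{1}{3^n} \sum_{m=1}^{3^n} S_m^\dagger \Lambda(S_m \rho S_m^\dagger) S_m, \tag{17} \\
S_m &= \bigotimes_{j=1}^{n} S_m^{(j)}, \tag{18}
\end{align}

where each $S_m^{(j)}$ is an element of the set given by
$\{\exp(-i(\pi/4)\sigma_p),\ p = x, y, z\}$. It is straightforward to
show that, for a single qubit and any $p$,

\begin{equation*}
\frac{1}{3} \sum_{m=1}^{3} S_m^\dagger \sigma_p S_m \, \rho \, S_m^\dagger \sigma_p S_m = \frac{1}{3} \sum_{q=x,y,z} \sigma_q \rho \sigma_q,
\end{equation*}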

so after a Clifford (Pauli+Symplectic) one-qubit twirl 
(C1T) HI we get 

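\begin{align}
\Lambda^{C1T}(\rho) &= \frac{1}{3^n} \sum_{m=1}^{3^n} S_m^\dagger \Lambda^{PT}(S_m \rho S_m^\dagger) S_m \tag{19} \\
&= \sum_{w=0}^{n} \sum_{\nu_w=1}^{\binom{n}{w}} \frac{\chi^{col}_{w,\nu_w}}{3^w} \sum_{i_w} P_{w,\nu_w,i_w} \, \rho \, P_{w,\nu_w,i_w}, \tag{20}
\end{align}

where the collective coefficients $\chi^{col}_{w,\nu_w}$ are just the diagonal
$\chi$-matrix coefficients $\chi_{l,l}$, re-labeled $\chi_{w,\nu_w,i_w}$, after
disregarding (averaging over) the information given by $i_w$:

\begin{equation}
\chi^{col}_{w,\nu_w} = \sum_{i_w} \chi_{w,\nu_w,i_w}. \tag{21}
\end{equation}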



This is so far what was presented in [6| , which can also be 
proven as in 0, Q using a different set of tools to handle 
the Clifford twirl as a Haar twirl d, H, [lj] . 
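
The effect of the Symplectic part of the twirl on a single qubit can be verified directly. The Python sketch below is an illustration only (the function names are ours): it checks that conjugation by the three gates $\exp(-i(\pi/4)\sigma_p)$, $p = x, y, z$, sends any given Pauli matrix onto each of $\sigma_x$, $\sigma_y$, $\sigma_z$ exactly once, up to a sign, so that the "which Pauli" information is averaged out.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SIGMA = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def symplectic_gate(p):
    """exp(-i (pi/4) sigma_p) = (I - i sigma_p) / sqrt(2)."""
    return (I2 - 1j * SIGMA[p]) / np.sqrt(2)

def conjugation_image(p, q):
    """Return (r, s) such that S_p^dagger sigma_q S_p = s * sigma_r, s = +-1."""
    S = symplectic_gate(p)
    image = S.conj().T @ SIGMA[q] @ S
    for r, sig in SIGMA.items():
        overlap = np.trace(sig @ image).real / 2
        if abs(abs(overlap) - 1) < 1e-12:
            return r, int(round(overlap))
    raise ValueError("image is not a Pauli matrix up to sign")

def twirled_pauli_term(q, rho):
    """(1/3) sum_p K_p rho K_p^dagger with K_p = S_p^dagger sigma_q S_p."""
    out = np.zeros((2, 2), dtype=complex)
    for p in "xyz":
        S = symplectic_gate(p)
        K = S.conj().T @ SIGMA[q] @ S
        out += K @ rho @ K.conj().T
    return out / 3

for q in "xyz":
    print(q, [conjugation_image(p, q) for p in "xyz"])
```

Since the sign of each image cancels in $K \rho K^\dagger$, every nonidentity Pauli term on a qubit is replaced by the uniform mixture over the three Pauli matrices, which is the origin of the factor $1/3^w$ for an operator of Pauli weight $w$.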

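Consider the computational state basis $|v_h\rangle$, where $v_h$
is a boolean vector of length $n$ and Hamming weight $h$.
(The Hamming weight $h$ of a computational state is just
the number of ones appearing in its binary representation.)
The first result we can obtain is that the fidelity
of a state $|v_h\rangle$ undergoing this transformation is independent
of the actual state,

\begin{align}
f(\Lambda^{C1T}, |v_h\rangle) &= \mathrm{Tr}\left[ |v_h\rangle\langle v_h| \, \Lambda^{C1T}(|v_h\rangle\langle v_h|) \right] \tag{22} \\
&= \sum_{w=0}^{n} \sum_{\nu_w=1}^{\binom{n}{w}} \frac{\chi^{col}_{w,\nu_w}}{3^w} \sum_{i_w} \left| \langle v_h | P_{w,\nu_w,i_w} | v_h \rangle \right|^2 \tag{23} \\
&= \sum_{w=0}^{n} \sum_{\nu_w=1}^{\binom{n}{w}} \frac{\chi^{col}_{w,\nu_w}}{3^w} \sum_{i_w} \left| \langle 0 | P_{w,\nu_w,i_w} | 0 \rangle \right|^2 \tag{24} \\
&= \sum_{w=0}^{n} \sum_{\nu_w=1}^{\binom{n}{w}} \frac{\chi^{col}_{w,\nu_w}}{3^w}. \tag{25}
\end{align}

To go from (23) to (24), we only need to realize that any
computational state $|v_h\rangle$ is a result of applying a Pauli
operator $P^{v_h}$ (that has $\sigma_x$ where $v_h$ has ones and
identity factors otherwise) to $|0\rangle$. This $P^{v_h}$ will either
commute or anticommute with $P_{w,\nu_w,i_w}$ (and the $\pm$ will
be absorbed by the modulus squared). The last equality
(25) is obtained by realizing that the only nonidentity
$P_{w,\nu_w,i_w}$ that takes $|0\rangle$ back to it (up to a global phase)
is the Pauli operator that has a $\sigma_z$ in all the positions
indicated by $\nu_w$ (and thus only one of all the possible $i_w$
given $\nu_w$ and $w$).

We must notice that although $f(\Lambda^{C1T}, |v_h\rangle)$ is then
equivalent to the average fidelity $F(\Lambda^{C1T})$ of the process
$\Lambda^{C1T}$, this is not the average fidelity of the process under
study, namely $F(\Lambda) = (D\chi_{0,0} + 1)/(D + 1)$ (see the earlier
discussion of the average fidelity).
However, this weaker twirl gives a different insight into
the map structure. The first result we point out, presented
in [6], is that we can obtain the diagonal elements
of the $\chi$-matrix grouped by Pauli weight

\begin{equation}
p_w = \sum_{\nu_w=1}^{\binom{n}{w}} \sum_{i_w} \chi_{w,\nu_w,i_w} = \sum_{\nu_w=1}^{\binom{n}{w}} \chi^{col}_{w,\nu_w}. \tag{26}
\end{equation}

The parameters $p_w$ and $\chi^{col}_{w,\nu_w}$ are just a coarse-graining
of the diagonal elements of the $\chi$-matrix. The $p_w$ relate
to the probability $\mathrm{Prob}(v_h, h)$ of obtaining any state $|v_h\rangle$
with Hamming weight $h$ when measuring the final state
$\Lambda^{C1T}(|0\rangle\langle 0|)$. We have

\begin{equation*}
\mathrm{Prob}(v_h, h) = \mathrm{Tr}\left[ |v_h\rangle\langle v_h| \, \Lambda^{C1T}(|0\rangle\langle 0|) \right]
= \sum_{w=0}^{n} \sum_{\nu_w=1}^{\binom{n}{w}} \frac{\chi^{col}_{w,\nu_w}}{3^w} \sum_{i_w} \left| \langle 0 | P_{w,\nu_w,i_w} | v_h \rangle \right|^2 .
\end{equation*}

For $\langle 0 | P_{w,\nu_w,i_w} | v_h \rangle$ to be nonzero (i.e., of modulus one), $\nu_w$ must
indicate nonidentity factors at least where there are ones
in $v_h$ (so it must be $w \geq h$). Also the $i_j$ in $i_w$ must be
1 = x or 2 = y for the qubits with ones in $v_h$, and 3 = z
for the $w - h$ qubits that have zeros in $v_h$ but have a
nonidentity factor in $P_{w,\nu_w,i_w}$. There will be exactly $2^h$ of
these operators for given $w$ and $\nu_w$, so

\begin{equation}
\mathrm{Prob}(v_h, h) = \sum_{w=h}^{n} \sum_{\nu^*_w=1}^{\binom{n-h}{w-h}} \frac{2^h}{3^w} \, \chi^{col}_{w, v_h + \nu^*_w}, \tag{27}
\end{equation}

where $v_h + \nu^*_w$ indicates a $\chi^{col}_{w,\nu_w}$ for Pauli operators that have
a nonidentity factor for at least all the qubits whose corresponding
component in $v_h$ is a one. $\nu^*_w$ labels the
coefficients with $w \geq h$ that fulfill this condition. If we
now discard the "which qubit" information given by $v_h$,
summing over all the $\binom{n}{h}$ possibilities, then

\begin{align}
\mathrm{Prob}(h) &= \sum_{v_h} \mathrm{Prob}(v_h, h) \tag{28} \\
&= \sum_{w=h}^{n} \sum_{v_h=1}^{\binom{n}{h}} \sum_{\nu^*_w=1}^{\binom{n-h}{w-h}} \frac{2^h}{3^w} \, \chi^{col}_{w, v_h + \nu^*_w} \tag{29} \\
&= \sum_{w=h}^{n} \frac{2^h}{3^w} \binom{w}{h} \sum_{\nu_w=1}^{\binom{n}{w}} \chi^{col}_{w,\nu_w} \tag{30} \\
&= \sum_{w=h}^{n} \frac{2^h}{3^w} \binom{w}{h} \, p_w, \tag{31}
\end{align}

where in going from (29) to (30) each $\nu_w$ is counted once for every one
of the $\binom{w}{h}$ subsets of $h$ qubits it contains.
In this way, all the $p_w$ are related to the probabilities
of measuring an outcome with Hamming weight $h$ by an
$n \times n$ matrix $R_{h,w} = \binom{w}{h} 2^h / 3^w$, as stated in [6].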
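
As an illustration of how the Pauli-weight distribution follows from measured Hamming-weight probabilities, the Python sketch below builds the matrix with entries $\binom{w}{h} 2^h / 3^w$ and inverts the linear relation $\mathrm{Prob}(h) = \sum_w R_{h,w} p_w$. It is a minimal sketch under our own assumptions: the function names are hypothetical, the rows and columns $h = w = 0$ are included (so the matrix has size $(n+1) \times (n+1)$), and the input probabilities are taken to be exact rather than sampled.

```python
from math import comb
import numpy as np

def hamming_to_pauli_weight_matrix(n):
    """R[h, w] = C(w, h) * 2**h / 3**w for h, w = 0..n (zero when w < h)."""
    R = np.zeros((n + 1, n + 1))
    for h in range(n + 1):
        for w in range(h, n + 1):
            R[h, w] = comb(w, h) * 2.0 ** h / 3.0 ** w
    return R

def pauli_weight_distribution(prob_h):
    """Solve Prob(h) = sum_w R[h, w] p_w for the vector of p_w."""
    n = len(prob_h) - 1
    R = hamming_to_pauli_weight_matrix(n)
    return np.linalg.solve(R, np.asarray(prob_h, dtype=float))

# Usage: a hypothetical distribution dominated by low Pauli weights (n = 3).
p_true = np.array([0.90, 0.07, 0.02, 0.01])
prob_h = hamming_to_pauli_weight_matrix(3) @ p_true
print(prob_h)
print(pauli_weight_distribution(prob_h))
```

The matrix is upper triangular with diagonal entries $(2/3)^w$, so it is always invertible, but those diagonal entries shrink exponentially with $w$, which is why statistical errors in the probabilities are amplified for large Pauli weights.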



We can also keep the "which qubit" information and
use the probabilities $\mathrm{Prob}(v_h, h)$ constructively to gain
even more detail. This strategy was already suggested in
[9] but more oriented to ensemble quantum information
processors. We present it now in a different manner so it
can be combined with the previous strategy.

Let us replace the descriptors $w$ and $\nu_w$ by $v_w$, a
boolean vector of length $n$ and Hamming weight $w$ characterizing
a Pauli operator $P_l$. $v_w$ has a zero in the $j$-th
position if and only if $P_l^{(j)} = I$; otherwise it has a one.
For example, an operator with nonidentity Pauli factors only on
qubits 1 and 3, for $n = 4$ qubits, has
$v_2 = (1,0,1,0)$. There are of course $\sum_{w=0}^{n} \binom{n}{w} = 2^n = D$
of these vectors describing the $P_l$.

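If we use Eq. (27) and start with the probability of having
all the qubits flipped in the outcome, and go backward
toward the survival probability (i.e., none of the
qubits flipped), we find

\begin{align}
\mathrm{Prob}(v_n, n) &= \frac{2^n}{3^n} \chi^{col}_{v_n}, \tag{32a} \\
\mathrm{Prob}(v_{n-1}, n-1) &= \frac{2^{n-1}}{3^{n-1}} \chi^{col}_{v_{n-1}} + \frac{2^{n-1}}{3^n} \chi^{col}_{v_n}, \tag{32b} \\
\mathrm{Prob}(v_{n-2}, n-2) &= \frac{2^{n-2}}{3^{n-2}} \chi^{col}_{v_{n-2}} + \frac{2^{n-2}}{3^{n-1}} \sum_{v_{n-1} \supset v_{n-2}} \chi^{col}_{v_{n-1}} + \frac{2^{n-2}}{3^n} \chi^{col}_{v_n}, \tag{32c}
\end{align}

etc., where the sum in (32c) runs over the two $v_{n-1}$ that have ones
wherever $v_{n-2}$ does.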



So essentially we could determine $\chi^{col}_{v_n}$ using (32a),
then insert it in (32b) and obtain the $n$ possible $\chi^{col}_{v_{n-1}}$
from the different $\mathrm{Prob}(v_{n-1}, n-1)$, and then insert
that in (32c), and so on and so forth. These equations
define a triangular matrix that relates the probabilities
$\mathrm{Prob}(v_h, h)$ to the collective coefficients $\chi^{col}_{v_w}$. Notice
there is no need to perform different experiments to
obtain the different probabilities: We only need to
implement $M$ realizations of the twirl and keep the
outcome of the measurement for each of the realizations.
This outcome should be an $n$-bit string indicating whether
each $j$-th qubit was found in $|0\rangle_j$ or $|1\rangle_j$.
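
The back-substitution just described can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: each boolean vector is encoded as an integer bitmask, the relation used is $\mathrm{Prob}(v_h, h) = \sum_{v_w \supseteq v_h} (2^h/3^w)\, \chi^{col}_{v_w}$ (every qubit set that contains the flipped qubits contributes), the function names are hypothetical, and the loop runs over all $2^n$ vectors, so it is a brute-force version meant for small $n$.

```python
def weight(v):
    """Hamming weight of the bitmask v."""
    return bin(v).count("1")

def forward_probabilities(chi_col, n):
    """Prob(v_h) = sum over supersets u of v_h of 2**h / 3**w(u) * chi_col[u]."""
    prob = {}
    for v in range(2 ** n):
        h = weight(v)
        prob[v] = sum(
            2.0 ** h / 3.0 ** weight(u) * c
            for u, c in chi_col.items()
            if u & v == v  # u has ones wherever v does
        )
    return prob

def back_substitute(prob, n):
    """Recover chi_col from the probabilities, from weight n down to 0."""
    chi_col = {}
    for v in sorted(range(2 ** n), key=weight, reverse=True):
        h = weight(v)
        # Contribution of the already determined, strictly larger qubit sets.
        known = sum(
            2.0 ** h / 3.0 ** weight(u) * chi_col[u]
            for u in chi_col
            if u & v == v
        )
        chi_col[v] = (prob[v] - known) * 3.0 ** h / 2.0 ** h
    return chi_col

# Usage: hypothetical collective coefficients for n = 3 qubits.
chi_true = {0b000: 0.85, 0b001: 0.05, 0b100: 0.04, 0b101: 0.03, 0b111: 0.03}
prob = forward_probabilities(chi_true, 3)
recovered = back_substitute(prob, 3)
print({format(v, "03b"): round(c, 6) for v, c in recovered.items()})
```

In an experiment the dictionary `prob` would be replaced by the relative frequencies of the recorded $n$-bit outcomes, all obtained from the same $M$ realizations of the twirl.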

The problem arises not in obtaining the experimental
information, but in its subsequent processing. The matrix
given by Eqs. (32) is of size $D \times D$, therefore the cost
of the processing would scale exponentially in $n$. For
this strategy to work, it is key to relate it hierarchically
to the determination of the $p_w$: The experimental
information required is the same and can be obtained
efficiently by sampling. The idea goes as follows. If we
are analyzing a map $\Lambda$ that is close to the identity (a
noise channel) or a quantum gate involving a few qubits
(typically one or two), then we would expect that above
a certain cut-off Pauli weight $w_{co}$, the $p_w$ will be null.
This is a reasonable expectation: Since $\sum_{w=0}^{n} p_w = 1$
(the trace-preserving condition), the $p_w$ cannot all be
arbitrarily large, and thus it will be possible to bound
the coefficients above the cut-off by a negligible amount.
In this scenario, the matrix relating the $\mathrm{Prob}(v_h, h)$ with
the $\chi^{col}_{v_w}$ will have a size $M_{co} \times M_{co}$, $M_{co} = \sum_{w=0}^{w_{co}} \binom{n}{w}$,
which scales polynomially in $n$ [27]. There is a second
caveat though. As explained in [6] and [9], respectively,
the errors in determining the $p_w$ or the $\chi^{col}_{v_w}$ scale
inefficiently with $w$, a consequence of the matrices
relating them with the corresponding probabilities [Eqs.
(31) and (32), respectively]. Although the measured
probabilities will have a standard deviation $< 1/\sqrt{M}$,
this error will propagate into the $p_w$ or the $\chi^{col}_{v_w}$ with a
factor that grows polynomially with $n$ but exponentially
with $w$. Again, we must resort to neglecting the $p_w$
after a certain cut-off. The system can be arbitrarily
large (arbitrary $n$), and as long as the $p_w$ are negligible
above a certain $w_{co}$ (with $w_{co}$ independent of or scaling
efficiently with $n$) we will be able to obtain all the
non-negligible $\chi^{col}_{v_w}$ efficiently.
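
To give a feeling for the saving brought by the cut-off, the short Python sketch below compares the size of the truncated system, the number of qubit subsets of weight at most $w_{co}$, with the full dimension $D = 2^n$. It is an illustration only; the function name and the chosen values of $n$ and $w_{co}$ are our own.

```python
from math import comb

def truncated_system_size(n, w_co):
    """Number of boolean vectors of length n with Hamming weight <= w_co."""
    return sum(comb(n, w) for w in range(w_co + 1))

# Usage: fixed cut-off w_co = 2 for a growing number of qubits.
for n in (4, 10, 20, 40):
    print(n, truncated_system_size(n, 2), 2 ** n)
```

For a fixed cut-off the truncated size grows as a polynomial of degree $w_{co}$ in $n$, whereas the untruncated system grows as $2^n$.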

Notice that, in the previous section, the requirement
that only a few ($\ll D$) coefficients $\chi_{w,\nu_w,i_w}$ are
non-negligible is not a priori. We can indeed run the protocol,
efficiently, and arrive at this conclusion. However,
the one-qubit twirling method poses a stronger condition,
since the values of the $\chi^{col}_{w,\nu_w}$ must respond to a hierarchy
associated to their Pauli weights. Only then can we establish
the Pauli weight cut-off and run the protocol [in
particular, solve the system of equations (32)].

With the twirl in $U(D)$ we obtain the coefficients directly
with a standard deviation $< 1/\sqrt{M}$, while with the
twirl in $U(2)^{\otimes n}$ we only obtain probabilities $\mathrm{Prob}(v_h, h)$
with standard deviations $< 1/\sqrt{M}$, which still need to
be propagated in order to obtain the estimated error for
the $\chi^{col}_{v_w}$.

With the protocol of Sec. IV [8], [10], the measurement
of the largest $\chi_{l,l}$ can then be done more precisely, with
no coarse-graining and with no restrictions on the map
under study. Clearly, the protocol of this section [T 
is much less demanding, requiring the implementation of
only 12n one-qubit gates instead of $O(n^2)$ one-qubit and
CNOT gates. However, this advantage is counterbalanced:
We have a more restricted and less precise tomographic
method. In practice, nevertheless, the choice
between the two will be given by the extent to which we
can control our system experimentally.

Finally, we must notice that in both approaches, the
methods are universal in the sense that they do not require
any prior knowledge on the specific dynamics of
$\Lambda$. The protocol twirling in full space is valid for any linear
Hermitian map, while the one with one-qubit twirling
only has the extra requirement of having a structure with
a cut-off Pauli weight. For an example of a characterization
incorporating substantial prior knowledge of the
dynamics or specific models for $\Lambda$, see [28].



VI. THE RELEVANCE OF THE DIAGONAL OF 
THE $\chi$-MATRIX

If we diagonalize the $\chi$-matrix, we will obtain the
weights of an operator-sum representation, where the operators
in the sum are the corresponding basis where the
$\chi$-matrix is diagonal. Of course, this basis will not necessarily
be the Pauli operator basis, but in principle a
combination of them. Using the notation of Sec. [Hi take 
X = SR to be the diagonalization of the x-matrix writ- 
ten in the Pauli operator basis. Let R be the change of 
basis, so 



\begin{equation*}
\Lambda(\rho) = \sum_{m=0}^{D^2-1} S_{m,m} \, A_m \rho A_m^\dagger, \qquad A_m = \sum_{l=0}^{D^2-1} R_{m,l} P_l,
\end{equation*}

where the $A_m$ form an orthonormal basis, but just as in
an operator-sum representation, they are not necessarily
unitary (otherwise any process would be unital) nor
Hermitian. And, as we already mentioned, the $S_{m,m}$
are real but could be negative in principle. Thus in
general neither the $A_m$ nor even the $S_{m,m}$ have a simple
interpretation.

Nevertheless, despite the different ways of describing 
the process under study A in |f| 0, [8j-TlQj] , in all the 
cases they determine specifically the diagonal elements
of the $\chi$-matrix of the map in the generalized Pauli
operator basis. Notice that either the one-qubit twirl or
the full-space twirl implies a Pauli twirl (since the Pauli
operators are a subgroup of the Clifford group in both
cases), and that the Pauli twirl erases the information of
the off-diagonal elements of the $\chi$-matrix. We ask then,
what is the meaning of the diagonal? It was assumed in
[6] that the $p_w$ represented the probability of an operator
of Pauli weight $w$ happening in the process described by
$\Lambda$. In [9], the $\chi^{col}_{v_w}$ were regarded as indicators of the
locality or range of the process, that is, the probability
of an operator involving the qubits in $v_w$ happening.
These are both quantities that are relevant to quantum
error correction and fault-tolerant quantum computing.

Both these interpretations are fair when the $\chi$-matrix
in the Pauli operator basis is approximately diagonal, or
at least block-diagonal in blocks characterized by $w$, $\nu_w$.
But that is not generally the case, in particular for maps
that will be of interest to us, such as quantum computing
gates. For example, the CNOT gate for qubits $a$ and $b$
has a $\chi$-matrix with only a $4 \times 4$ nonzero block,

\begin{equation*}
\chi_{CNOT} = 0.25 \begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & -1 \\ -1 & -1 & -1 & 1 \end{pmatrix},
\end{equation*}

corresponding to $P_l = I,\ \sigma_z^{(a)},\ \sigma_x^{(b)},\ \sigma_z^{(a)} \otimes \sigma_x^{(b)}$. Clearly,
the off-diagonal coefficients carry critical information
with equal weight, which for example differentiates the
CNOT from a depolarizing channel with the same $P_l$.
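
The nonzero block of the CNOT $\chi$-matrix can be reproduced numerically. The Python sketch below is an illustration under our own assumptions: qubit $a$ is taken as the control and qubit $b$ as the target, the $\chi$-matrix convention is $\Lambda(\rho) = \sum_{l,l'} \chi_{l,l'} P_l \rho P_{l'}$ with unnormalized Pauli operators, and the function name is hypothetical.

```python
import itertools
import numpy as np

PAULIS_1Q = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def chi_matrix_of_unitary(U, n):
    """chi-matrix of rho -> U rho U^dagger in the n-qubit Pauli basis."""
    labels = ["".join(t) for t in itertools.product("IXYZ", repeat=n)]
    D = 2 ** n
    coeffs = []
    for lab in labels:
        P = np.array([[1.0 + 0j]])
        for ch in lab:
            P = np.kron(P, PAULIS_1Q[ch])
        coeffs.append(np.trace(P @ U) / D)  # U = sum_l c_l P_l
    c = np.array(coeffs)
    return labels, np.outer(c, c.conj())    # chi[l, l'] = c_l * conj(c_l')

# CNOT with the first qubit (a) as control and the second (b) as target.
CNOT = np.array(
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex
)
labels, chi = chi_matrix_of_unitary(CNOT, 2)
idx = [labels.index(s) for s in ("II", "ZI", "IX", "ZX")]
print(chi[np.ix_(idx, idx)].real)
```

The diagonal of this block is uniform, so after a Pauli twirl the gate would be indistinguishable from a Pauli channel that applies the same four operators with equal probability; the sign pattern of the off-diagonal entries is what identifies the CNOT.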

Thus previous interpretations of $p_w$ and $\chi^{col}_{w,\nu_w}$ are arguable:
We could even have in principle a process involving
a set of qubits given by $w$, $\nu_w$ that has $\chi_{l,l'} \neq 0$
in that block but $\chi^{col}_{w,\nu_w} = 0$ in the diagonal. However,
as demonstrated in Secs. II A and III A, it is possible
to draw a relation between the diagonal and off-diagonal
elements of the $\chi$-matrix.

For CP maps, Eq. (3) guarantees that if either $\chi_{l,l} = 0$
or $\chi_{l',l'} = 0$, the off-diagonal $\chi_{l,l'}$ is null. And for positive
maps in general, Eq. (8) gives us a bound that is exponentially
close to this result. This is a very powerful result,
since once we have established the nonzero diagonal elements,
in order to perform a full characterization we
only need to worry about the off-diagonal elements that
correspond to that resulting block. This hierarchization
of the information could potentially allow for a complete
quantum tomography of the process at a scalable cost,
provided that the number of non-null matrix elements
turns out to be $O(\mathrm{poly}(n))$.
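
The way the diagonal restricts the off-diagonal elements can be illustrated with a short Python sketch. It relies on a standard property of positive semidefinite matrices, $|\chi_{l,l'}|^2 \leq \chi_{l,l}\, \chi_{l',l'}$, which we assume here to be the content used for CP maps; the exact form of the bounds in the equations cited above is not reproduced, and the function names are hypothetical.

```python
import numpy as np

def candidate_off_diagonals(chi_diag, tol=1e-12):
    """Index pairs (l, l') with l < l' that positivity does not force to zero."""
    support = [l for l, d in enumerate(chi_diag) if d > tol]
    return [(a, b) for i, a in enumerate(support) for b in support[i + 1:]]

def satisfies_cp_bound(chi, tol=1e-12):
    """Check |chi[l, l']|**2 <= chi[l, l] * chi[l', l'] for every pair."""
    d = np.real(np.diag(chi))
    return bool(np.all(np.abs(chi) ** 2 <= np.outer(d, d) + tol))

# Usage: a rank-one (unitary-like) chi-matrix supported on four operators.
c = np.array([0.5, 0.0, 0.5, 0.0, 0.5, -0.5])
chi = np.outer(c, c.conj())
print(candidate_off_diagonals(np.real(np.diag(chi))))
print(satisfies_cp_bound(chi))
```

If only $K$ diagonal elements are found to be non-negligible, at most $K(K-1)/2$ off-diagonal pairs remain to be characterized, which is polynomial in $n$ whenever $K$ is.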

It is in order here to point out though that the 
work in 0, [13] also presents a strategy to measure 
the off-diagonal elements of the $\chi$-matrix. However,
an ancillary qubit which is not twirled is required for 
this task. The ancilla is assumed to be error-free and 
outside the system we are looking to characterize. This 
does not imply an issue when it comes to scalability, 
since only one qubit ancilla is required for arbitrary D. 
Nonetheless, it puts this method in a different category 
regarding resources and assumptions when it comes to 
its implementation. 



VII. CONCLUSIONS 

By revisiting previous work 0, 0, we have stated 
two scalable approaches for characterizing the diagonal
elements of the $\chi$-matrix in the Pauli operator basis, for
any arbitrary quantum process. We emphasize once more
that the work in Secs. IV and V arises from the revision
of these previous results and goes beyond them, as
we summarize here: The general approach
discussed in Sec. IV restates the method originally presented
in [8], further clarifying its ability to measure the
largest diagonal elements of the $\chi$-matrix together. We
study this protocol by recognizing its familiarity with
other twirling methods and present a natural alternative
approach which, we conclude, is slightly less convenient if
we work with only a few qubits. On the other hand, the
approach discussed in Sec. V combines the two protocols
originally presented in [6] and [9], by building and again
proving both protocols, but simultaneously.

Furthermore, we have analyzed the two general ap- 
proaches comparatively, establishing their advantages 
and disadvantages: While one is more powerful, the other 
is more realistic from the implementation point of view. 

We have made the point that there are different 
ways of twirling that reproduce Eqs. (jj) and ([5]). 
Moreover, we have shown that a deeper analysis may 
lead to advantages of one form of twirl over another, in 
particular for working with a small number of qubits. 
This is the case in Sec. IV when comparing the Clifford
twirl and the MUB twirl in $U(D)$. Another example of
this, but twirling in U(2)® n , can be found in g [Igj], 
where it is shown that by carefully choosing the initial
state of a twirl experiment, it is possible to reduce the
total number of twirl operators from $12^n$ to $6^n$.

On the other hand, in the light of Eqs. (3) and (8),
our work establishes the relevance of the diagonal coefficients.
We believe that this type of hierarchization of
the information is key to achieving complete tomography
in a scalable way. Since the number of parameters is indeed
exponentially large, it is necessary to gather them
or find relations among them, and then design protocols
that will retrieve information about a whole group in one
parameter.

The coarse-grained coefficients of Sec. V [Eqs. (21)
and (26)] represent one example of grouping. When a
sum of nonnegative elements is null, we can conclude 
that all the elements in the sum are null. On the other 
hand, the bounding of the off-diagonal elements by the 
diagonals also gives us a form of grouping. When a diago- 
nal element is null, we can conclude that all the elements 
corresponding to that row and column are also null. 

If many of the parameters turn out to be null indeed in
one shot, eventually leaving only poly(n) non-negligible
ones, these strategies become an efficient way to measure
all the coefficients. Nevertheless, notice that designing
methods that retrieve specific partial information is not
a trivial task, even when we assume that we can neglect
all the other parameters. We should continue searching
for bounds and relations between the characterization
parameters of different types of maps. Also, we should further
pursue the design of scalable methods to measure
subgroups of information, while requiring the protocols
to rely experimentally on minimum possible resources.